= CONSTRUCTION AND ANALYSIS OF ASYNCHRONOUS NEURAL NETWORKS =

S. Jakovlev

Keywords: pattern recognition, neural networks, generalization

== 1. Introduction ==

Any work on artificial neural networks [2, 3, 4] models, with greater or lesser success, the operation of the human brain. It is therefore important to understand such concepts as memory, learning, regularity and comprehension. Minsky makes the same point [5]: “Immaturity which demonstrates our inability to answer such questions is shown even in language on which questions are formulated. Pairs of contrastive words of type "parallel" - "consecutive", "local" - "global", are used so as if they are related to precisely certain technical concepts. … The task consists in how to unite them into the clear, distinct theory.” For example, the architecture of a neural network already embodies a certain view of memory, and an error-correction reinforcement system embodies a view of how learning proceeds and how connections in memory are formed.

These terms, however, matter to us only in a specific sense. We are not concerned merely with the form of these concepts, which philosophy can study in abstraction from any content. Nor is the speculative treatment of their content found in psychology of much interest here. Even the rigorous study of these concepts in medicine is not essential for us: once a subject is divided into parts that can each be studied experimentally, we no longer understand what is meant by such abstract concepts as memory, learning and consciousness, because the full interaction of all the parts cannot be reproduced in an experiment. How, then, do we study these concepts?
We are interested in these concepts in what may be called a technical and informational sense, obtained by investigating various models and trying to establish whether they are informationally capable of, and suitable for, technical tasks (a necessary but far from sufficient condition for describing how a brain really works). The range of such technical tasks is somewhat special: first, recognition tasks and, second, control tasks (forecasting tasks follow from these). For effective operation, however, the ability to identify a stimulus is not enough; the system must also hold a connected representation of the given stimulus together with many other stimuli from the environment. Such a general picture of objects, which can later lead the system to grasp certain regularities, cannot be achieved with synchronous artificial neural networks alone (a stricter definition is given in the next section), i.e. networks in which signals are assumed to travel without delay, so that the state of all neurons changes simultaneously. Such networks are clearly only a mathematical idealization: no technical signal propagates instantaneously, and signals in biological neural networks travel with a certain delay (the reaction time to a signal depends on the amount of information to be processed). This might seem to be a purely technical limitation, but in fact the natural delay in a network links the first stimulus with the second, and then with the third. Because signals spread over the connections between neurons asynchronously, neurons react not to a single isolated stimulus but to a complex of preceding stimuli. Thus asynchronous artificial neural networks (see section 2) capture the essence of cause-and-effect relations.
At present most research is devoted to synchronous artificial neural networks, since they are well known and easy to apply, whereas for asynchronous networks there is neither a detailed classification nor developed training methods. Asynchronous networks are also widely held to be highly unstable, with training convergence that is hard to predict. This article attempts to fill the gap by offering a systematic construction of the architecture of such networks, one that allows the error-correction reinforcement system to be applied. Compact basic examples show that asynchronous networks are suitable for solving complex (nonlinear) tasks, with the advantage that the working mechanism of such networks becomes clearer (interpretable). The proposed structural changes also increase the information capacity of the network.

== 2. The notion of synchronous and asynchronous networks ==

Since the modern literature defines the concepts synchronous and asynchronous for neural networks very vaguely, we turn to the work of Rosenblatt [1]. He does not define them directly, as he does not use them in his work, but his classification of neural network concepts allows for the existence of asynchronous neural networks. This is clear from the excerpt of his work given below.

“Definition 12. The transfer functions of the connections in a perceptron depend on two parameters: the transmission time of an impulse along the connection, <math>\tau_{ij}</math>, and the coefficient or weight of the connection, <math>v_{ij}</math>.
The transfer function of the connection from element <math>u_i</math> to element <math>u_j</math> has the form <math>c_{ij}(t) = f[v_{ij}(t), u_i(t - \tau_{ij})]</math>.”

From the other definitions of the same work it follows that:
- <math>u_i(t)</math> is the value of the signal leaving element <math>u_i</math>;
- <math>u_i(t - \tau_{ij})</math> is the value of the signal that left element <math>u_i</math> at an earlier moment, lagging behind the current signal by the delay <math>\tau_{ij}</math> that a signal needs to pass from element <math>u_i</math> to element <math>u_j</math>; for brevity we call this the value of the preceding signal;
- <math>f</math> is a function of two variables, the connection weight <math>v_{ij}</math> and the value of the preceding signal; as a rule it is simply their product;
- <math>c_{ij}(t)</math> is the value of the transfer function, i.e. the value fed to the input of element <math>u_j</math>.

Now we can formulate the concepts of synchrony and asynchrony. A network is called synchronous if the transmission time of every connection is either zero or one fixed constant <math>\tau</math>. A network is called asynchronous if the transmission time <math>\tau_{ij}</math> of each connection between elements <math>u_i</math> and <math>u_j</math> has its own value, which is likewise constant.

== 3. Difficulties of revealing regularities in synchronous artificial neural networks ==

To understand on what basis an artificial neural network (ANN) could conclude whether the presented stimuli contain a certain regularity, the author investigated binary synchronous ANNs, since the operation of analog networks is harder to analyze (it is not transparent enough to judge whether regularities can be revealed). Two types of binary synchronous ANN can be distinguished:
# perceptron networks;
# optimizing networks (filters): corner classification (CC4) [2], Hopfield and Hamming networks, adaptive resonance networks [6], etc.
Filters are strictly determined systems that do not admit an error-correction reinforcement system, which is a serious drawback when the redundancy of the system has to be minimized. Moreover, memory in filters is not distributed: one specific neuron of the recognition layer answers for a given category, and if it is destroyed, the memory of the whole category is lost. Because of this, filters (for example, networks of adaptive resonance theory) cannot be regarded as direct models of biological neural networks, where memory is distributed.

We therefore investigate the classical perceptron as the most promising system for our goals. One might assume that a regularity present in the input stimuli can be found in the weights after training, but on closer consideration this assumption must be rejected for the following reasons. The first layer of weights in a perceptron [1] is, as is known, set at random to +1 (excitatory connection) or -1 (inhibitory connection). The second layer of weights is adjusted by the reinforcement system. The role of the first layer is significant, since without it nonlinear tasks could not be solved. The first layer thus acts as a function that maps the stimulus signals from a linear representation into a nonlinear one, which allows training to converge on a number of "full" tasks. But because the weights of the first layer are set at random, signals passing through this layer lose every trace of regularity, so no regularity can be obtained or restored when the second layer is trained. Consequently the first layer of a perceptron, although necessary for recognition, reduces to chaos all the representations contained in the original stimuli, even if they contain a regularity.
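The role of the random first layer can be illustrated with a few lines of Python. This is only a sketch under stated assumptions, not the author's program: the function names `random_first_layer` and `a_activity`, the threshold `theta`, the layer sizes and the two example stimuli are all hypothetical.

```python
import random


def random_first_layer(n_s, n_a, seed=0):
    """Draw the S-A weights at random from {+1, -1} (one row per A-element)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [[rng.choice((1, -1)) for _ in range(n_s)] for _ in range(n_a)]


def a_activity(stimulus, first_layer, theta=0):
    """An A-element is active (1) when its weighted input exceeds theta."""
    return [
        1 if sum(w * s for w, s in zip(row, stimulus)) > theta else 0
        for row in first_layer
    ]


# Hypothetical sizes: 8 S-elements, 16 A-elements.
FIRST_LAYER = random_first_layer(n_s=8, n_a=16)

# Two stimuli related by a simple rule (the second is the first shifted by one bit).
ACT_A = a_activity([1, 1, 0, 0, 1, 1, 0, 0], FIRST_LAYER)
ACT_B = a_activity([0, 1, 1, 0, 0, 1, 1, 0], FIRST_LAYER)
print(ACT_A)
print(ACT_B)
```

Because the rows of `FIRST_LAYER` are unrelated random vectors, the two printed activity patterns are not tied to the shift relation between the stimuli in any systematic way, which is the loss of regularity discussed above.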
One might further think that the problem of this "degeneration of regularity" would be solved if the weights of the first layer were set by some rule rather than at random. Surprisingly, however, a systematic choice of the first-layer weights in most cases makes the presented stimuli indistinguishable, because the activity of the A-elements is in some cases identical for different stimuli. When the weights are set by rule rather than at random, the rules must therefore satisfy a number of restrictions:
# The activity of every A-element must be neither always zero nor always one: an element that is never active gives no means of distinguishing stimuli, and one that is always active gives no means of selective training.
# No two sets of connections may be identical, as the activity arising from one stimulus could then not be distinguished from the activity arising from another.
# The weights of different sets of connections must not carry too little information, i.e. certain bits of these weights must not always be identical (for example, some high-order bits being negative for all stimuli), and at least as many bits must be involved as are needed to represent the required number of stimuli differently.
# The required number of bits must therefore differ across the sets; otherwise activity is absent or identical for different stimuli.
# The connection weights must not be symmetric, since symmetry cancels the activity to zero.
# There can also be special cases in which the activity of the A-elements for one stimulus forms a group symmetric to the activity of the A-elements for another stimulus, owing to a combination of, for example, three (or more) sets of weights.
As a result of these restrictions, for example, fewer than 32 of 4096 possible variants are suitable, and a certain threshold still has to be selected.
This means that constructing rules that respect these restrictions is in practice a serious technical problem (even though the probability of avoiding them increases with the number of connections). Simple rules are applied in filters anyway, but as already noted, filters have other drawbacks. From this we can draw a conclusion that is not categorical but is well founded in practice: a synchronous neural network can preserve a regularity present in the stimuli only in special cases, when the weights are not chosen at random and a definite rule for forming the first-layer weights has been found.

== 4. Regularity Detection (RD) Method ==

=== 4.1. General principles ===

In designing the RD method the author started from the assumption that, if no regularity can be found in the weights of a trained layer, it should still be possible to construct a system in which the activity of the A-elements expresses, or at least preserves, the regularity present in the stimuli. For this, the generalization function of the first layer must be replaced by something that still lets the network solve nonlinear problems. The architecture described here therefore has no first layer: the stimuli are connected directly to the outputs, but every S-element (which is thus also an A-element) is connected to at least one A-element by a connection with a certain delay (in the simplest case <math>\tau = 1</math>). An example of the architecture of a neural network in the RD method is shown in figure 1.

Figure 1. Architecture of the neural network in the RD method

=== 4.2. The XOR task ===

To demonstrate that the method can solve nonlinear tasks, we consider the elementary XOR task, which also lets us see the method in operation. The architecture of the neural network for this task is shown in figure 2. The truth table of the XOR function is given in table 1.
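The XOR task can also be traced in code. The sketch below is an illustration under assumptions, not the author's implementation. It assumes that A1 and A2 copy the two inputs, that A3 and A4 fire only when the corresponding input is active on the current and on the previous stimulus (a delay of one step), and that the stimuli are presented in the order (0,0), (1,1), (0,1), (1,0). That order and the strict zero threshold are inferred from tables 2 and 3 rather than stated in the text, and all function and variable names are hypothetical.

```python
# Presentation order of the stimuli (X1, X2).
# ASSUMPTION: inferred from the activity matrix SA in table 2;
# it differs from the row order of the truth table (table 1).
STIMULI = [(0, 0), (1, 1), (0, 1), (1, 0)]
TARGETS = [x1 ^ x2 for x1, x2 in STIMULI]  # XOR of the two inputs


def activity_matrix(stimuli):
    """Return one row of A-element activity (A1..A4) per stimulus.

    ASSUMPTION: A1 and A2 copy the inputs; A3 (A4) is active only when
    X1 (X2) is active on the current AND on the previous stimulus,
    which models a connection with a delay of one step.
    """
    rows = []
    prev = (0, 0)  # nothing was shown before the first stimulus
    for x1, x2 in stimuli:
        rows.append([x1, x2, x1 & prev[0], x2 & prev[1]])
        prev = (x1, x2)
    return rows


def train(sa, targets, max_iterations=100):
    """Error-correction training of the weights from A-elements to the output."""
    weights = [0] * len(sa[0])  # initial weights are zero, as in the text
    for iteration in range(1, max_iterations + 1):
        changed = False
        for row, target in zip(sa, targets):
            total = sum(w * a for w, a in zip(weights, row))
            output = 1 if total > 0 else 0  # ASSUMPTION: strict threshold at zero
            if output != target:
                step = 1 if target == 1 else -1
                # Only weights of active A-elements are corrected.
                weights = [w + step * a for w, a in zip(weights, row)]
                changed = True
        if not changed:  # a full pass without corrections: training is done
            return weights, iteration
    raise RuntimeError("training did not converge")


SA = activity_matrix(STIMULI)
WEIGHTS, ITERATIONS = train(SA, TARGETS)
print(SA)                   # rows for S1..S4
print(WEIGHTS, ITERATIONS)  # weights W1..W4 and the number of passes
```

Under these assumptions the computed matrix coincides with table 2, and training stops on the fifth pass with W1 = 1, W2 = -1, W3 = 0, W4 = 2, the final values of table 3. The intermediate weights of the first iteration differ from table 3, which records a correction on stimulus 2 that this particular threshold rule does not make.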
Table 1. Truth table of the XOR function
{| class="wikitable"
! Input X1 !! Input X2 !! Output Y
|-
| 0 || 0 || 0
|-
| 0 || 1 || 1
|-
| 1 || 0 || 1
|-
| 1 || 1 || 0
|}

The initial values of the weights and of the threshold of all elements (here and in the later examples) are zero. The matrix SA (the activity of the A-elements for each stimulus) is then as shown in table 2.

Table 2. Matrix SA in the XOR task
{| class="wikitable"
! Stimulus !! A1 !! A2 !! A3 !! A4
|-
| S1 || 0 || 0 || 0 || 0
|-
| S2 || 1 || 1 || 0 || 0
|-
| S3 || 0 || 1 || 0 || 1
|-
| S4 || 1 || 0 || 0 || 0
|}

The matrix SA shows the effect of the delay: if a signal is present on an input (A1, A2) at least twice in a row, a signal also appears on the A-element connected to that input. The first stimulus produces no activity, so it is assigned to the negative class, and during training it affects only the activity for the other stimuli (i.e. it resets the activity that the following stimulus would cause in the A-elements). The third A-element is not excited by any stimulus, so it is superfluous for training. The search for the weights by the error-correction reinforcement system is shown in table 3.

Table 3. Search for the weights in the XOR task
{| class="wikitable"
! Iteration !! Stimulus !! W1 !! W2 !! W4
|-
| 1 || 2 || -1 || -1 || 0
|-
| 1 || 3 || -1 || 0 || 1
|-
| 1 || 4 || 0 || 0 || 1
|-
| 2 || 2 || * || * || *
|-
| 2 || 3 || * || * || *
|-
| 2 || 4 || 1 || 0 || 1
|-
| 3 || 2 || 0 || -1 || 1
|-
| 3 || 3 || 0 || 0 || 2
|-
| 3 || 4 || 1 || 0 || 2
|-
| 4 || 2 || 0 || -1 || 2
|-
| 4 || 3 || * || * || *
|-
| 4 || 4 || 1 || -1 || 2
|-
| 5 || 2 || * || * || *
|-
| 5 || 3 || * || * || *
|-
| 5 || 4 || * || * || *
|}
Note: * means that nothing changes, since the stimulus is recognized correctly.

=== 4.3. Peculiarities of multidimensional tasks ===

The main problems in modeling multidimensional tasks are technical: insufficient speed or insufficient available memory. The approximate limit is a task with 16 binary inputs, 65536 A-elements and 8 binary outputs (for a fully connected network architecture).
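The size of this limiting task can be checked with simple arithmetic. The sketch below only counts connections and activity values for a fully connected network. The helper name `task_size` is hypothetical, and the storage figure rests on the assumption that each binary activity value is stored in one bit.

```python
def task_size(n_inputs, n_a_elements, n_outputs):
    """Count stimuli, connections and activity values of a fully connected network."""
    n_stimuli = 2 ** n_inputs                   # every possible binary input pattern
    s_to_a = n_inputs * n_a_elements            # first-layer (S-A) connections
    a_to_r = n_a_elements * n_outputs           # trained (A-output) connections
    activity_values = n_stimuli * n_a_elements  # entries of the SA activity matrix
    return {
        "n_stimuli": n_stimuli,
        "s_to_a": s_to_a,
        "a_to_r": a_to_r,
        "activity_values": activity_values,
    }


SIZE = task_size(n_inputs=16, n_a_elements=65536, n_outputs=8)
# ASSUMPTION: one bit per activity value.
SA_MIB = SIZE["activity_values"] // 8 // 2 ** 20
print(SIZE)
print(SA_MIB, "MiB for the full SA matrix")  # 512 MiB
```

With 16 inputs there are 65536 possible stimuli, about 1.05 million first-layer connections and about 0.52 million trained connections. The full activity matrix has 2^32 entries, which is why it must either be stored or recomputed for every stimulus.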
Because of the large number of connections they are hard to hold in memory, so they must either be kept on disk (which also slows the program) or be recalculated many times, after each presentation of a stimulus; with 65536 stimuli, training usually requires at least 1000 iterations. Simulating one iteration of such a task on a modern personal computer can take several minutes, and the whole training cycle (with some optimization) can take several days. Yet such a task is in essence not large: it corresponds to fully learning a single image of 256×256 pixels with 256 colors. In the proposed RD method the problem becomes even more serious, because the way the first-layer weights are set has in effect been made more complex: the function that produces the S-A connections has changed from a fast random number generator to a slow function that analyzes several stimuli to produce its answer. With this method, tasks with 16 inputs, up to 1000 A-elements and 8 outputs can be handled without serious difficulty. We therefore analyze here the operation of the RD method with 8 inputs, to which numbers from 0 to 255 are fed; the number of A-elements is at most 200, and the stimuli are 256 random color points whose color (one of 256 colors) must be obtained at the 8 outputs.

The next basic problem is convergence. It arises because generating the connections of the first layer requires at least 256 A-elements in order to analyze a large number of stimuli. According to the convergence theorem [1], up to one A-element is required per stimulus; in practice, however, thanks to training, 60-80 % of the number of stimuli suffices. Analyzing only the current and the previous stimulus gives 8 A-elements if only the first bit is analyzed.
It is also possible to analyze several bits, revealing regularities in the appearance of a constant activity signal not on one input but on several inputs at once. Several stimuli can also be analyzed at the same time (by means of a delay). Here we therefore propose a construction of the first-layer connections of the A-elements that has proved itself in multidimensional tasks. It is based on revealing symmetry in the stimuli. After a stimulus is presented, each input is compared in turn with the other inputs, forming the matrix shown in figure 3. A one is placed at an intersection (denoting activity of an A-element) if the two compared values coincide. Several bits can then be compared at once, for example the first and second with the second and third, and the activity of the A-element is set only if both comparisons are true (fig. 3). It is desirable to compare groups of up to ½ of the number of S-elements. This makes it possible to generate the activity of 112 A-elements from eight S-elements.

Figure 3. Formation of A-element activity by revealing symmetry in the stimulus. In the example the stimulus 01011001 is presented; on the left only one bit takes part in each comparison, on the right two bits.

Convergence for this task is shown in figure 4. The following formula is generally used . It may prove insufficient for multidimensional tasks, in which case this way of activating A-elements must be combined with the asynchronous methods described above. For our demonstration task, however, it suffices, even though the number of A-elements is roughly half the number of stimuli; this speaks well for this way of activating A-elements, namely for the greater information capacity of the network.
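The symmetry-based activation can be sketched as follows. The exact enumeration rule is not spelled out above, so the sketch relies on assumptions that are not from the source: groups of 1 to n/2 consecutive bits are taken cyclically (wrapping around the end of the stimulus), and every unordered pair of group start positions yields one A-element. This is one reading that happens to reproduce the stated count of 112 A-elements for eight S-elements. The function name `symmetry_activity` is hypothetical.

```python
from itertools import combinations


def symmetry_activity(bits, max_width=None):
    """Activity of A-elements obtained by comparing groups of input bits.

    ASSUMPTIONS (not from the source): groups are cyclic windows of
    1..max_width consecutive bits, and each unordered pair of window
    start positions corresponds to one A-element.
    """
    n = len(bits)
    if max_width is None:
        max_width = n // 2  # compare groups of up to half the S-elements
    activity = []
    for width in range(1, max_width + 1):
        # Window starting at each position, wrapping around the end.
        windows = [
            tuple(bits[(start + k) % n] for k in range(width))
            for start in range(n)
        ]
        for i, j in combinations(range(n), 2):
            # The A-element is active only if all compared bits coincide.
            activity.append(1 if windows[i] == windows[j] else 0)
    return activity


STIMULUS = [0, 1, 0, 1, 1, 0, 0, 1]  # the stimulus 01011001 from figure 3
ACTIVITY = symmetry_activity(STIMULUS)
print(len(ACTIVITY))  # 4 widths x 28 pairs = 112 A-elements
```

Each of the four group widths contributes 28 pairwise comparisons, which gives 112 in total. Other enumerations are possible, and the result depends on the cyclic-window assumption.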
Convergence on the multidimensional task is shown in figure 4.

Figure 4. Convergence on a multidimensional task

== 5. Conclusion ==

The proposed regularity detection method preserves the regularity contained in the input stimuli as the signal passes through the first layer of the neural network. The approach also allows nonlinear tasks to be solved. The method therefore has an advantage over other kinds of neural networks, for example perceptrons: its algorithm is more transparent, which makes the operation of the network interpretable. Advantages of asynchronous over synchronous neural networks have been shown; they consist in establishing a connection (regularity) between different stimuli. A drawback of the approach is slower training, which we hope to eliminate in later versions. Despite the slower training, the proposed network shows a trend toward higher information capacity than the classical perceptron. Such a network thus requires less memory or can memorize a greater number of stimuli during training.

== 6. References ==

# Rosenblatt F. (1965). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms [in Russian], Mir, Moscow.
# Kak S. C. (1998). On generalization by neural networks, Information Sciences, 111, pp. 293-302.
# Fausett L. (1994). Fundamentals of Neural Networks: Architectures, Algorithms and Applications, Prentice Hall International Inc.
# Rojas R. (1996). Neural Networks. A Systematic Introduction. Berlin, Springer-Verlag.
# Minsky M. (1971). Perceptrons [in Russian], Mir, Moscow.
# Wasserman P. (1992). Neural Computing: Theory and Practice [in Russian], Mir, Moscow.

Sergey Jakovlev, Mg.sc.comp., Institute of Information Technology, Riga Technical University, 1 Meza Street, Latvia, e-mail: sergeyk@fis.lv, tac@inbox.lv

Sergey Jakovlev. Construction and Analysis of Asynchronous Neural Networks

The theoretical need for technical indicators in the architecture of neural networks that signal the existence of certain regularities in the presented stimuli is shown. The concept of synchrony and asynchrony is made precise. Synchronous and asynchronous perceptrons are examined for their ability to preserve the regularities present in the input stimuli for possible later analysis. The regularity detection (RD) method is proposed, which takes into account a number of difficulties that other kinds of neural networks have in revealing regularities in the presented stimuli. The application of the RD method to nonlinear multidimensional tasks is analyzed. The proposed method increases the information capacity of the network. As a result it is more economical in memory but requires longer training.